Overview

Dataset Statistics

Number of Variables 10
Number of Rows 532
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 83.9 KB
Average Row Size in Memory 161.4 B
Variable Types
  • Categorical: 3
  • Numerical: 7

Dataset Insights

Patent_count is skewed
Turnover_lay is skewed
Turnover_2012 is skewed
Total_assets_2012 is skewed
Employees_2012 is skewed
R&D_2012 is skewed
Country_code is skewed
ID has a high cardinality: 469 distinct values
Patent_industry has constant length 1
University has constant length 1

Variables

ID

categorical

Approximate Distinct Count 469
Approximate Unique (%) 88.2%
Missing 0
Missing (%) 0.0%
Memory Size 50.5 KB

Length

Mean 31.3647
Standard Deviation 16.7708
Median 27
Minimum 4
Maximum 124

Sample

1st row Dowa Electronics M...
2nd row Japan Science and ...
3rd row Otsuka Chemical Co...
4th row JSR CORPORATION
5th row Central Glass Co. ...

Letter

Count 14517
Lowercase Letter 10829
Space Separator 1739
Uppercase Letter 3688
Dash Punctuation 37
Decimal Number 8
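The labels in the Letter block (Lowercase Letter, Uppercase Letter, Space Separator, Dash Punctuation, Decimal Number) match Unicode general category names, and the reported Count of 14517 equals the lowercase and uppercase totals combined (10829 + 3688). The sketch below shows one way such a tally could be produced with Python's standard `unicodedata` module. Whether the report generator works this way is an assumption, and `category_counts` is a hypothetical helper name. It is applied to the 4th sample row, the only one shown without truncation.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lu', 'Ll', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "JSR CORPORATION"  # 4th sample row of the ID column
counts = category_counts(sample)

# 'Lu' = Uppercase Letter, 'Ll' = Lowercase Letter, 'Zs' = Space Separator,
# 'Pd' = Dash Punctuation, 'Nd' = Decimal Number
letters = counts["Lu"] + counts["Ll"]
print("length:", len(sample))
print("letters:", letters)
print("uppercase:", counts["Lu"], "lowercase:", counts["Ll"])
print("spaces:", counts["Zs"], "dashes:", counts["Pd"], "digits:", counts["Nd"])
```

Summing these per-row tallies over all 532 ID values would give column-level totals of the kind listed above.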

Patent_industry

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.9%
Missing 0
Missing (%) 0.0%
Memory Size 34.3 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 532
  • The top 2 categories (4, 3) take over 50.0%
  • Patent_industry has words of constant length

University

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Memory Size 34.3 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 532
  • The top 2 categories (0, 1) take over 50.0%
  • The largest value (0) is over 3.03 times larger than the second largest value (1)
  • University has words of constant length

Patent_count

numerical

Approximate Distinct Count 209
Approximate Unique (%) 39.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 2864.0113
Minimum 1
Maximum 204120
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Patent_count is skewed right (γ1 = 9.1975)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 5
Median 36
Q3 406.25
95-th Percentile 7311.7
Maximum 204120
Range 204119
IQR 401.25

Descriptive Statistics

Mean 2864.0113
Standard Deviation 17736.6762
Variance 3.1459e+08
Sum 1.5237e+06
Skewness 9.1975
Kurtosis 89.7817
Coefficient of Variation 6.1929
  • Patent_count is not normally distributed (p-value 4.581410075476347e-25)
  • Patent_count has 84 outliers
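Several of the reported statistics are derived from others: Range is Maximum minus Minimum, IQR is Q3 minus Q1, and Coefficient of Variation is Standard Deviation divided by Mean. As a quick consistency check, the sketch below recomputes them from figures copied out of this report for three of the numerical variables. The `derive` helper is a hypothetical name, and small differences in the last digit are expected because the inputs are already rounded.

```python
# Figures copied from the Patent_count, Turnover_lay and Country_code sections.
reported = {
    "Patent_count": dict(minimum=1, maximum=204120, q1=5, q3=406.25,
                         mean=2864.0113, std=17736.6762,
                         value_range=204119, iqr=401.25, cv=6.1929),
    "Turnover_lay": dict(minimum=0, maximum=2.7667e8, q1=750000, q3=4.9575e6,
                         mean=1.4989e7, std=4.1376e7,
                         value_range=2.7667e8, iqr=4.2075e6, cv=2.7604),
    "Country_code": dict(minimum=1, maximum=11, q1=1, q3=4,
                         mean=3.797, std=2.3594,
                         value_range=10, iqr=3, cv=0.6214),
}


def derive(stats):
    """Recompute range, IQR and coefficient of variation from base figures."""
    return {
        "value_range": stats["maximum"] - stats["minimum"],
        "iqr": stats["q3"] - stats["q1"],
        "cv": stats["std"] / stats["mean"],
    }


derived = {name: derive(stats) for name, stats in reported.items()}

for name, values in derived.items():
    for key, value in values.items():
        # Compare each recomputed figure with the one printed in the report.
        print(f"{name:13s} {key:12s} recomputed={value:.6g} "
              f"reported={reported[name][key]:.6g}")
```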

Turnover_lay

numerical

Approximate Distinct Count 234
Approximate Unique (%) 44.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 1.4989e+07
Minimum 0
Maximum 2.7667e+08
Zeros 1
Zeros (%) 0.2%
Negatives 0
Negatives (%) 0.0%
  • Turnover_lay is skewed right (γ1 = 4.5357)

Quantile Statistics

Minimum 0
5-th Percentile 64646.6
Q1 750000
Median 1.706e+06
Q3 4.9575e+06
95-th Percentile 8.0459e+07
Maximum 2.7667e+08
Range 2.7667e+08
IQR 4.2075e+06

Descriptive Statistics

Mean 1.4989e+07
Standard Deviation 4.1376e+07
Variance 1.712e+15
Sum 7.9742e+09
Skewness 4.5357
Kurtosis 22.8973
Coefficient of Variation 2.7604
  • Turnover_lay is not normally distributed (p-value 6.556230388780252e-25)
  • Turnover_lay has 102 outliers

Turnover_2012

numerical

Approximate Distinct Count 159
Approximate Unique (%) 29.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 1.6906e+07
Minimum 0
Maximum 3.7712e+08
Zeros 2
Zeros (%) 0.4%
Negatives 0
Negatives (%) 0.0%
  • Turnover_2012 is skewed right (γ1 = 4.9235)

Quantile Statistics

Minimum 0
5-th Percentile 228333.9
Q1 7.0762e+06
Median 7.0762e+06
Q3 7.0762e+06
95-th Percentile 7.8899e+07
Maximum 3.7712e+08
Range 3.7712e+08
IQR 0

Descriptive Statistics

Mean 1.6906e+07
Standard Deviation 3.788e+07
Variance 1.4349e+15
Sum 8.994e+09
Skewness 4.9235
Kurtosis 29.2709
Coefficient of Variation 2.2407
  • Turnover_2012 is not normally distributed (p-value 6.2280838214056275e-25)
  • Turnover_2012 has 218 outliers
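In this section Q1, Median and Q3 all equal 7.0762e+06, so the IQR is 0. The report does not state how outliers are counted. If it uses the common box-plot rule (values outside Q1 - 1.5 × IQR and Q3 + 1.5 × IQR), which is an assumption here, both fences collapse onto that single value and every row that differs from it is flagged. The sketch below illustrates the effect on a small made-up list. `tukey_outliers` is a hypothetical helper and `toy_values` is not the real column, only a mix of quantile values quoted above.

```python
def tukey_outliers(values, q1, q3, k=1.5):
    """Return the values lying outside the fences q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


q = 7.0762e6  # reported Q1 = Median = Q3 for Turnover_2012

# Hypothetical toy sample: the reported minimum, 5th percentile,
# 95th percentile and maximum around four copies of the quartile value.
toy_values = [0, 228333.9, q, q, q, q, 7.8899e7, 3.7712e8]

flagged = tukey_outliers(toy_values, q1=q, q3=q)
print(len(flagged), "of", len(toy_values), "values flagged")
```

With a zero IQR, only the four copies of the quartile value escape the flag, so a large outlier count under this rule says more about how many rows share one value than about extreme observations.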

Total_assets_2012

numerical

Approximate Distinct Count 160
Approximate Unique (%) 30.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 2.3011e+07
Minimum 923
Maximum 6.85e+08
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Total_assets_2012 is skewed right (γ1 = 5.747)

Quantile Statistics

Minimum 923
5-th Percentile 259570.55
Q1 8.2372e+06
Median 8.2372e+06
Q3 8.2372e+06
95-th Percentile 1.1608e+08
Maximum 6.85e+08
Range 6.85e+08
IQR 0

Descriptive Statistics

Mean 2.3011e+07
Standard Deviation 6.0884e+07
Variance 3.7068e+15
Sum 1.2242e+10
Skewness 5.747
Kurtosis 40.9524
Coefficient of Variation 2.6459
  • Total_assets_2012 is not normally distributed (p-value 5.322653203485676e-25)
  • Total_assets_2012 has 218 outliers

Employees_2012

numerical

Approximate Distinct Count 131
Approximate Unique (%) 24.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 41702.0695
Minimum 12
Maximum 434246
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Employees_2012 is skewed right (γ1 = 3.6219)

Quantile Statistics

Minimum 12
5-th Percentile 828
Q1 24956
Median 24956
Q3 24956
95-th Percentile 193321.5
Maximum 434246
Range 434234
IQR 0

Descriptive Statistics

Mean 41702.0695
Standard Deviation 66311.2545
Variance 4.3972e+09
Sum 2.2186e+07
Skewness 3.6219
Kurtosis 13.144
Coefficient of Variation 1.5901
  • Employees_2012 is not normally distributed (p-value 2.3397407789493953e-24)
  • Employees_2012 has 174 outliers

R&D_2012

numerical

Approximate Distinct Count 117
Approximate Unique (%) 22.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 874720.8083
Minimum 1211
Maximum 1.0772e+07
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • R&D_2012 is skewed right (γ1 = 4.1591)

Quantile Statistics

Minimum 1211
5-th Percentile 31061.9
Q1 523137
Median 523137
Q3 523137
95-th Percentile 3.6252e+06
Maximum 1.0772e+07
Range 1.0771e+07
IQR 0

Descriptive Statistics

Mean 874720.8083
Standard Deviation 1.4872e+06
Variance 2.2118e+12
Sum 4.6535e+08
Skewness 4.1591
Kurtosis 18.2411
Coefficient of Variation 1.7002
  • R&D_2012 is not normally distributed (p-value 1.4594848026182359e-24)
  • R&D_2012 has 159 outliers

Country_code

numerical

Approximate Distinct Count 11
Approximate Unique (%) 2.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 8.3 KB
Mean 3.797
Minimum 1
Maximum 11
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Country_code is skewed right (γ1 = 0.5139)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 1
Median 4
Q3 4
95-th Percentile 8
Maximum 11
Range 10
IQR 3

Descriptive Statistics

Mean 3.797
Standard Deviation 2.3594
Variance 5.567
Sum 2020
Skewness 0.5139
Kurtosis -0.4025
Coefficient of Variation 0.6214
  • Country_code is not normally distributed (p-value 9.427785848085104e-19)
  • Country_code has 18 outliers
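The γ1 figures quoted for each numerical variable are skewness coefficients, and the negative Kurtosis for Country_code suggests the report uses excess kurtosis (0 for a normal distribution), although that is an inference rather than something the report states. The sketch below computes both from central moments, using the population (biased) formulas. The report generator may apply a different bias correction, and the two lists are hypothetical toy data, not columns from this dataset.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n


def skewness(values):
    """gamma_1 = m3 / m2**1.5 (population form)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """m4 / m2**2 - 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


symmetric = [1, 2, 3, 4, 5]      # toy data, no skew
right_tailed = [1, 1, 1, 2, 50]  # toy data, one large value on the right

print("symmetric   :", skewness(symmetric), excess_kurtosis(symmetric))
print("right-tailed:", skewness(right_tailed), excess_kurtosis(right_tailed))
```

A positive γ1, as reported for every numerical variable here, indicates a longer right tail; the magnitudes range from 0.5139 for Country_code to 9.1975 for Patent_count.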

Interactions

Correlations

Missing Values